Subjective Intelligibility of Deep Neural Network-Based Speech Enhancement
Authors
Abstract
Recent literature indicates increasing interest in deep neural networks for use in speech enhancement systems. Currently, these systems are mostly evaluated through objective measures of speech quality and/or intelligibility. Subjective intelligibility evaluations of these systems have so far not been reported. In this paper we report the results of a speech recognition test with 15 participants, where the participants were asked to pick out words in background noise before and after enhancement using a common deep neural network approach. We found that, although the objective measure STOI predicts that intelligibility should improve or at the very least stay the same, the speech recognition threshold, which is a measure of intelligibility, deteriorated by 4 dB. These results indicate that STOI is not a good predictor for the subjective intelligibility of deep neural network-based speech enhancement systems. We also found that the postprocessing technique of global variance normalisation does not significantly affect subjective intelligibility.
Similar resources
Objective Evaluation of a Deep Neural Network Approach for Single-channel Speech Intelligibility Enhancement
Single-channel speech intelligibility enhancement is much more difficult than multi-channel intelligibility enhancement. It has recently been reported that machine learning training-based single-channel speech intelligibility enhancement algorithms perform better than traditional algorithms. In this paper, the performance of a deep neural network method using a multiresolution cochleagram feat...
Monaural Speech Enhancement using Deep Neural Networks by Maximizing a Short-Time Objective Intelligibility Measure
In this paper we propose a Deep Neural Network (DNN) based Speech Enhancement (SE) system that is designed to maximize an approximation of the Short-Time Objective Intelligibility (STOI) measure. We formalize an approximate-STOI cost function and derive analytical expressions for the gradients required for DNN training and show that these gradients have desirable properties when used together w...
Text-informed speech enhancement with deep neural networks
A speech signal captured by a distant microphone is generally contaminated by background noise, which severely degrades the audible quality and intelligibility of the observed signal. To resolve this issue, speech enhancement has been intensively studied. In this paper, we consider a text-informed speech enhancement, where the enhancement process is guided by the corresponding text information,...
Improving Deep Neural Network Based Speech Enhancement in Low SNR Environments
We propose a joint framework combining speech enhancement (SE) and voice activity detection (VAD) to increase the speech intelligibility in low signal-noise-ratio (SNR) environments. Deep Neural Networks (DNN) have recently been successfully adopted as a regression model in SE. Nonetheless, the performance in harsh environments is not always satisfactory because the noise energy is often domina...
SNR-Based Progressive Learning of Deep Neural Network for Speech Enhancement
In this paper, we propose a novel progressive learning (PL) framework for deep neural network (DNN) based speech enhancement. It aims at decomposing the complicated regression problem of mapping noisy to clean speech into a series of subproblems for enhancing system performances and reducing model complexities. As an illustration, we design a signal-to-noise ratio (SNR) based PL architecture by ...